Korean Compound Noun Indexing Based on Lexical Association and Conceptual Association

Authors

  • Bo-Hyun Yun
  • Jin-Dong Kim
  • Hae-Chang Rim
Abstract

Conventional methods have handled compound nouns with the goal of enhancing recall. That is, they extract only the unit nouns within a full compound noun and do not extract head modifiers, which can improve precision. This paper presents a new method of Korean compound noun indexing that can improve both recall and precision. Our method extracts head modifiers from a compound noun by using lexical association and conceptual association. To enhance recall, the unit nouns composing a compound noun are extracted as indexing terms. To increase precision, head modifiers within a full compound noun are obtained by the weighted sum of lexical association and conceptual association. In the experiment, we evaluate the effectiveness of using head modifiers to supplement single nouns for indexing. The experimental data are 2,600 documents and 30 queries in KTSET 2.0. Experimental results show that the proposed method yields better retrieval performance than previous techniques.
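
The weighted-sum idea in the abstract can be illustrated with a minimal sketch. The abstract does not define the two association measures, the weight, or any selection rule, so everything below is an assumption made for illustration: the function names `combined_association` and `index_terms`, the mixing parameter `weight`, the cutoff `threshold`, the reading of a head modifier as a (modifier, head) noun pair, and the toy scores. It is not the authors' implementation.

```python
from itertools import combinations


def combined_association(lexical, conceptual, weight=0.5):
    """Weighted sum of two association scores.

    `weight` is a hypothetical mixing parameter in [0, 1]; the abstract
    does not say how the two scores are actually weighted.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * lexical + (1.0 - weight) * conceptual


def index_terms(unit_nouns, lexical_scores, conceptual_scores,
                weight=0.5, threshold=0.5):
    """Return unit nouns plus the noun pairs whose combined score passes a cutoff.

    Unit nouns are always kept (the recall side). A pair (modifier, head),
    with the modifier preceding the head, is added when its combined score
    reaches `threshold` (the precision side). Both the pair enumeration and
    the cutoff are assumptions of this sketch.
    """
    terms = list(unit_nouns)
    for pair in combinations(unit_nouns, 2):
        score = combined_association(
            lexical_scores.get(pair, 0.0),
            conceptual_scores.get(pair, 0.0),
            weight,
        )
        if score >= threshold:
            terms.append(pair)
    return terms


# Toy scores, invented purely for illustration.
nouns = ["information", "retrieval", "system"]
lexical = {
    ("information", "retrieval"): 0.9,
    ("retrieval", "system"): 0.6,
    ("information", "system"): 0.2,
}
conceptual = {
    ("information", "retrieval"): 0.8,
    ("retrieval", "system"): 0.5,
    ("information", "system"): 0.3,
}

terms = index_terms(nouns, lexical, conceptual)
print(terms)
```

With these toy scores, the two adjacent pairs pass the cutoff and the non-adjacent pair does not, so the index holds the three unit nouns plus two pairs. In practice the lexical score would presumably come from corpus co-occurrence statistics and the conceptual score from a thesaurus or concept hierarchy, but the abstract does not specify either source.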


Similar resources

Object and Action Naming: A Study on Persian-Speaking Children

Objectives: Nouns and verbs are the central conceptual linguistic units of language acquisition in all human languages. While the noun-bias hypothesis claims that nouns have a privilege in children’s lexical development across languages, studies on Mandarin and Korean and other languages have challenged this view. More recent cross-linguistic naming studies on children in German, Turkish,...


A Hybrid Approach for Bracketing Noun Sequence

For a resource-poor language like Hindi, it is very difficult to bracket a noun sequence using approaches based only on a corpus or a lexical database. For semantic knowledge, the power of both types of resources needs to be combined. Therefore, the affinity between two nouns is preferably measured using backoff association, which is the combination of lexical and conceptual associ...


Compound Noun Segmentation Based on Lexical Data Extracted from Corpus

Compound noun analysis is one of the crucial problems in Korean language processing because a series of nouns in Korean may appear without white space in real texts, which makes it difficult to identify the morphological constituents. This paper presents an effective method of Korean compound noun segmentation based on lexical data extracted from a corpus. The segmentation is done in two steps: ...


Deverbal Compound Noun Analysis Based on Lexical Conceptual Structure

This paper proposes a principled approach for analysis of semantic relations between constituents in compound nouns based on lexical semantic structure. One of the difficulties of compound noun analysis is that the mechanisms governing the decision system of semantic relations and the representation method of semantic relations associated with lexical and contextual meaning are not obvious. The...


Developing a Semantic Similarity Judgment Test for Persian Action Verbs and Non-action Nouns in Patients With Brain Injury and Determining its Content Validity

Objective: Evidence from brain trauma suggests that the two grammatical categories of noun and verb are processed in different regions of the brain due to differences in the complexity of grammatical and semantic information processing. Studies have shown that verbs belonging to different semantic categories lead to neural activity in different areas of the brain, and action verb processing is r...




Publication date: 1997